runtime: skip auto-compaction for non-token overflow#2819
Merged
dgageot merged 1 commit intoMay 19, 2026
Conversation
Auto-compaction is only useful when the rejection is a token-count overflow — summarising older turns reduces the input token count. For wire-level overflow ([OverflowKindWire]) the request body itself exceeds the provider's cap, and the latest turn alone is over the limit; the compaction call would have to send the same oversized history and would also be rejected. For media overflow ([OverflowKindMedia]) we have no media-stripping during compaction today, so a retry would resend the same attachment and fail again. In both cases the recovery attempt always fails, then we surface the error anyway, while having spent an extra provider call and several seconds of wall-clock latency. This change skips compaction for those two kinds and surfaces the error directly. The token-overflow path is unchanged.
a252f53 to
dffb7bb
Compare
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The implementation is correct and the logic is well-structured. Both overflow-kind classification and the compaction-skip decision are sound.
What was checked:
OverflowKindtype and constants (tokens,wire,media) — correctclassifyOverflowtwo-tier classification (structured Tier 1 before substring Tier 2, media before wire before tokens) — correct priority orderingOverflowKindOffallback chain for legacy*ContextOverflowErrorwraps — correct; defaults toOverflowKindTokensas historical behaviourhandleStreamErrorskip-compaction branch: wire/media overflows returnstreamErrorFatal→runTurnreturnsturnExit→ outer loop terminates immediately. No circuit-breaker bypass, no infinite retry.FormatErrorswitch: non-overflow errors (OverflowKindOfreturns"") correctly fall through toerr.Error(). All three overflow kinds have actionable messages.classifyErrorCodeswitch covers all three new overflow kinds with distinct error codes.- Test coverage:
TestHandleStreamError_WireOverflowSkipsCompactionandTestHandleStreamError_MediaOverflowSkipsCompactiondirectly exercise the new skip paths.TestClassifyOverflowcovers Tier 1 and Tier 2 patterns including the 413-wins-over-token-prose case.
Hypotheses investigated and dismissed:
- Counter not incremented for wire/media = defeated circuit-breaker: Wire/media overflow always returns
streamErrorFatal, which exits the loop viaturnExit. The counter is irrelevant because there is no re-entry. "content length exceeds"misclassification: That entry is pre-existing, unchanged code (context line in diff, not a+line) — outside the scope of this review.
dgageot
approved these changes
May 19, 2026
1340870
into
docker:trungutt/overflow-kind-classification
16 of 17 checks passed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked on #2818. Review after that one merges.
When a request is rejected because the whole request body is too big to send (e.g. HTTP 413, large paste, oversized attachment), the runtime today reacts the same way it reacts to a token-count overflow: it tries to recover by compacting the conversation.
Compaction is itself a model call. It has to send the same conversation history that just got rejected, including the oversized turn. So that call is rejected too. Then the runtime gives up — but only after burning a second provider call and several seconds of wall-clock latency, and the user sees a "context window exceeded — try /compact" message that is actively wrong (compaction is exactly what just failed, and starting a new session is the only thing that helps).
The same trap applies to media-size rejections: there is no media-stripping path during compaction yet, so the offending attachment gets resent and the compaction call fails the same way.
What this change achieves